1. INTRODUCTION

Tifinagh is the alphabet of the Amazigh language, which is widely spoken across North Africa.
The alphabet was standardized by the Moroccan government on October 17, 2001, and is formally
recognized by the International Organization for Standardization (ISO) [1]. Tifinagh-IRCAM has
33 characters, as shown in Figure 1.

Handwritten character recognition is one of the most interesting and important areas of natural
language processing. Considerable research has been carried out in this domain to provide
important services such as document scanning, bank cheque processing, postal code reading, and the
processing of various handwritten forms. In most previous works [2]-[4], a Tifinagh handwritten character
recognition system is divided into three main phases: i) preprocessing: image resizing,
segmentation, and binarization, ii) feature extraction: this step generates features of the character image
based on its geometrical characteristics, and iii) classification and recognition.

The major difficulties of these systems arise during feature extraction. Most systems suffer from
very long training times and from the difficulty of optimizing the parameters of the convolutional
neural network (CNN). Feature extraction is time-consuming and may affect the accuracy of the
system. In this paper we present a new CNN model with optimized parameters for Tifinagh
handwritten characters in order to overcome the difficulties listed above. The model addresses the
problems of character confusion and training time that we faced in our previous works [2], [3].

Figure 1. Tifinagh IRCAM alphabet


The rest of this paper is organized as follows: the second section reviews previous works. The
CNN architecture is then presented in the third section. The proposed model is described in
section four, and the experimental results are presented in section five. Finally, we conclude and
provide suggestions for future research.


2. PREVIOUS WORKS 

In the literature, numerous convolutional neural network systems have been developed. For Tangut
character recognition, Zhang and Han [4] constructed a recognition system based on deep learning
algorithms. They used a large training dataset, which helps ensure the high accuracy of a deep
learning system, and achieved an accuracy of 94%.

Rismiyati et al. [5] presented a new model of Javanese character recognition which uses deep
learning techniques to classify handwritten Javanese characters. The classification used a dataset of
2,470 images of 20 characters. The input images are 32x32 pixels. Classification is performed
using a convolutional neural network (CNN) and a deep neural network (DNN). With k-fold cross
validation, they obtained accuracies of 70.22% and 64.65% for the CNN and DNN, respectively.

Zhang et al. [6] introduced new advances in applying deep learning methods to handwritten
Chinese character recognition and handwritten Chinese text recognition. They eliminated the need for data
augmentation and model ensembles. By combining deep learning methods with traditional approaches, they
were able to achieve state-of-the-art performance for both systems.

Jindal et al. [7] proposed a new method based on deep convolutional neural networks. They used a
dataset of 35 different Gurumukhi characters. Experimental results showed an accuracy of 98.32% on the
training dataset and 74.66% on the test data.

Tifinagh character recognition has become an active field of research in the last decade because of
the introduction of the script in education, industry, and government institutions. Traditional systems
involve many different processes, including character image preprocessing, database preparation
(feature extraction), selection of the best features, and classification. Niharmine et al. [3] proposed an
enhanced feature extraction method based on genetic algorithms. The proposed system achieved good
results with better features. The classification phase is performed using a feedforward neural network.

Ouadid et al. [8] presented a model built on graph theory. They used the Harris corner detector
to extract interest points, and built a graph representation of Tifinagh characters from the extracted
points. Classification was done by computing the spectral properties of the adjacency matrix that
represents the affinity between graphs. The system achieves a recognition rate of 99.02%.

Amrouch et al. [9] proposed an optical character recognition (OCR) system for Tifinagh using a
hybrid approach that merges the Hough transform and hidden Markov models. After binarization,
segmentation, and resizing of the image, the final feature vector of the character image is constructed
from the Hough transform. This vector is converted into a sequence of observations that is used for the
classification and recognition phase. Finally, they apply the forward classifier to recognize the
handwritten character. They obtained interesting results during the testing phase using a local database.

Oulamara and Duvernoy [10] used the Hough transform to extract straight segments together with
their attributes (length and orientation). Feature vectors were generated by analyzing the
characters in the parametric space. The adopted method achieved interesting results in tests with a
local database. Classical methods remain weak in terms of performance, and few projects have used
enhanced deep learning techniques. Works on Tifinagh handwritten character recognition using deep
convolutional neural networks (DCNN) are very few; researchers have only started applying them in
the last three years. Benaddy et al. [11] proposed a new CNN system tested on the Amazigh handwritten
character database (AMHCD) [12] and achieved a recognition accuracy of 99.10%. The CNN
extracts features directly from raw pixels. They use a CNN of 5 adjacent layers: the first three layers
perform feature extraction and the two remaining layers perform the classification step.

Sadouk et al. [13] developed a system using two deep learning architectures: deep belief networks
(DBNs) and CNNs. The authors used the AMHCD database to train and test the two networks. Experimental
tests show an accuracy of 95.47% for the DBN, while the CNN reaches an accuracy of 98.25%.

The major issues faced by previous research are the following: first, they require preprocessing
steps that take significant time; second, they can confuse similar characters such as ‘Yaz’ and
‘Yazz’, ‘Yay’ and ‘Yag’, or ‘Yadd’ and ‘Yatt’; third, the recognition or classification time is too
long; fourth, the majority of these systems use only 31 characters instead of 33. In order to resolve
these issues, we apply a new technique based on the Keras neural networks library to classify and
recognize Tifinagh characters. Keras is an open source machine learning library released by
Francois Chollet on March 27, 2015. It has been widely used in computer vision, especially in
the field of pattern recognition.


3. THE PROPOSED METHOD 
3.1. Convolutional neural networks 

Convolutional neural networks are very similar to classical neural networks. They are composed of
neurons with learnable weights and biases. Each neuron receives some inputs, computes a dot product, and
optionally follows it with a non-linearity. The term "deep neural network" refers to the number of
hidden layers: a regular neural network usually uses just one hidden layer, whereas a deep network uses
multiple hidden layers. The multiple hidden layers between the raw input data and the output label allow
the network to learn features at various levels of abstraction, so that the network itself performs
feature extraction. LeNet [14], built by LeCun in 1998, is one of the earliest CNNs.

Regular neural networks take an input vector and pass it through a sequence of hidden layers. Each
hidden layer is made up of a group of neurons, where each neuron is fully connected to all
neurons in the preceding layer, and where neurons in a single layer function completely
independently and do not share any connections. The last layer, which is fully connected, is called the
“output layer”, and in classification settings it represents the class scores.
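
The forward pass of one fully connected layer, as described above, can be sketched in a few lines of
plain Python. This is only an illustrative sketch: the function names, the input vector, and the weight
values are hypothetical and are not taken from the system described in this paper.

```python
def relu(x):
    """Non-linearity applied after the dot product."""
    return x if x > 0 else 0.0


def dense_forward(inputs, weights, biases, activation=None):
    """Fully connected layer: every neuron sees every input value."""
    outputs = []
    for row, bias in zip(weights, biases):
        # Dot product between the neuron's weights and the inputs, plus bias.
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


# Hypothetical example: 3 inputs feeding a layer of 2 neurons.
x = [1.0, -2.0, 0.5]
W = [[0.5, 0.25, 1.0],
     [-1.0, 0.5, 2.0]]
b = [0.25, 0.0]
hidden = dense_forward(x, W, b, activation=relu)
print(hidden)
```

Each neuron uses its own row of weights and shares nothing with the other neurons of the layer, which
is the independence property mentioned in the text.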

Convolutional neural networks take input images directly, which makes it possible to constrain the
architecture in a more sensible way. ConvNet layers have neurons arranged in three dimensions: width,
height, and depth. The depth here refers to the third dimension of an activation volume. For a red,
green, blue (RGB) image, the depth is 3. The final output layer has dimensions 1x1xclass, since the
ConvNet architecture reduces the full image to a single vector of class scores. Figure 2 shows the
architecture of convolutional neural networks.


3.2. ConvNets layers 

A simple ConvNet is a sequence of layers. Every layer of a ConvNet transforms one volume of 
activations to a new one through a differentiable function. ConvNet architectures have three important types 
of layers: pooling layer, convolutional layer and fully connected layer. We will assemble these layers to 
construct a full CNN architecture. 

For an RGB image of size 32x32, the ConvNet layers are as follows: i) INPUT [32x32x3]
holds the pixel data of the image, here of dimensions 32x32 with three color channels (R, G, B),
ii) the CONV layer computes the output of neurons that are connected to local regions in the input, each
computing a dot product between its weights and the small region it is connected to in the input volume;
the result is a volume such as [32x32xk] if we use k filters, iii) the rectified linear unit (ReLU) layer
applies an elementwise threshold operation, max(0,x), which leaves the volume size unchanged
([32x32xk]), iv) the pooling layer performs a downsampling operation along the spatial dimensions
(width, height), producing a volume such as [16x16xk], and v) the fully connected layer computes the
class scores, producing a volume of size [1x1xclass].
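
The volume sizes listed above can be checked with the usual output-size formulas for convolution and
pooling. The sketch below is a minimal illustration under stated assumptions: the helper names, the
3x3 kernel with "same" padding, and the value k = 8 are hypothetical choices made only to reproduce
the [32x32xk] and [16x16xk] volumes of the example.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, pool, stride=None):
    """Spatial size after pooling (stride defaults to the pool size)."""
    stride = stride or pool
    return (size - pool) // stride + 1


k = 8          # hypothetical number of filters
kernel = 3     # hypothetical kernel size
side = 32      # input image is 32x32x3

conv_side = conv_output_size(side, kernel, padding=kernel // 2)  # "same" padding
relu_side = conv_side                                            # ReLU keeps the size
pool_side = pool_output_size(relu_side, pool=2)

shapes = [
    (side, side, 3),            # INPUT
    (conv_side, conv_side, k),  # CONV
    (relu_side, relu_side, k),  # ReLU
    (pool_side, pool_side, k),  # POOL
]
print(shapes)
```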

The k filters act as feature detectors on the original input image. A non-linear function is then
applied to the result of the convolution operation to obtain the so-called activation map (also
called feature map). Many research projects have built different CNN architectures for classification.
Some of the most important deep CNN networks are AlexNet [15], visual geometry group (VGG) networks
[16], and region-based convolutional neural networks [17].

Figure 2. Convolutional neural network architecture


4. RESEARCH METHOD 

We used the Keras library to implement our CNN model. We add one layer at a time, starting
from the beginning of the CNN (the input). The first layer is a convolutional layer (Conv2D). It is
composed of learnable filters. We set the number of filters to 32 for the first two layers, 64 for the
next two layers, and 128 for the last two layers. Each filter transforms a part of the image using the
kernel filter. The kernel filter matrix is applied across the whole image. Filters can be considered as a
transformation of the image. The advantage of our model is that the CNN can extract features that are
useful everywhere from the transformed images (feature maps).
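
To make the role of a kernel filter concrete, the following sketch slides a small kernel over a toy
image and produces a feature map. It is written in plain Python for illustration only; the function
name, the 4x4 image, and the 2x2 vertical-edge kernel are hypothetical, and the sketch uses no padding,
which is an assumption rather than a setting reported for our model. As in most deep learning
libraries, the operation is implemented as a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the whole image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # Dot product between the kernel and the local image region.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Toy image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The same kernel weights are reused at every position, so the edge is detected wherever it occurs in
the image.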

The second key layer in our CNN model is the pooling layer (MaxPool2D). This layer acts as
a downsampling filter: it picks the maximal value among the neighboring pixels in each pooling
window. This operation reduces the computational cost and limits overfitting. The pooling size must be
chosen carefully: the larger the pooling dimension, the stronger the downsampling.
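
A minimal sketch of max pooling in plain Python is given below. The function name and the 4x4 feature
map are hypothetical; the 2x2 window with stride 2 matches the pool size listed in Table 1.

```python
def max_pool2d(feature_map, pool=2):
    """Keep the maximum of each non-overlapping pool x pool window."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, height - pool + 1, pool):
        row = []
        for j in range(0, width - pool + 1, pool):
            window = [
                feature_map[i + di][j + dj]
                for di in range(pool)
                for dj in range(pool)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map reduced to 2x2.
fm = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]
pooled = max_pool2d(fm)
print(pooled)
```

With a 2x2 window each spatial dimension is halved, so three quarters of the values are discarded.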

The purpose of our architecture is to combine convolutional and pooling layers, so that the CNN is
able to combine local features and extract more global features of the image. The activation
function, which adds non-linearity to the network, is ReLU, defined as max(0, x) as shown in (1).


$$f(x) = \max(0, x) = \begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } x \geq 0 \end{cases} \tag{1}$$


The flatten layer is used to transform the final feature maps into one single vector. This flattening
phase is applied in order to use a fully connected layer after the convolutional/maxpool layers. It
assembles all the local features of the previous convolutional layers.

Finally, we feed the features into the two fully connected (Dense) layers, which form an artificial
neural network (ANN) classifier. The last layer is a dense layer with 33 outputs and a SoftMax
activation function, so the network outputs a probability for each class. The CNN architecture has
291,073 trainable parameters, as shown in Table 1.
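
The SoftMax activation that turns the scores of the last dense layer into class probabilities can be
sketched as follows. The function name and the example scores are hypothetical; subtracting the
maximum score is a common numerical-stability step and is an implementation choice of this sketch.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three classes.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
predicted = probs.index(max(probs))  # class with the highest probability
print(probs, predicted)

# With 33 equal scores, every class gets the same probability 1/33.
uniform = softmax([0.0] * 33)
```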


Table 1. Proposed architecture parameters

Layer (type)          #kernels  Kernel/pool size  Output shape    #param
Image (InputLayer)    -         -                 (1, 3, 28, 28)  0
Conv1 (Conv2D)        32        5x5               (32, 28, 28)    832
Conv2 (Conv2D)        32        5x5               (32, 28, 28)    25,632
Pool1 (MaxPooling2D)  -         2x2               (32, 14, 14)    0
Conv3 (Conv2D)        64        3x3               (64, 14, 14)    18,496
Conv4 (Conv2D)        64        3x3               (64, 14, 14)    36,928
Pool2 (MaxPooling2D)  -         2x2               (64, 7, 7)      0
Dense128 (Dense)      -         -                 (128)           204,928
Dense33 (Dense)       -         -                 (33)            4,257
Output (SoftMax)      -         -                 (33)            0
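
The parameter counts in Table 1 can be reproduced with the standard formulas for convolutional and
dense layers (weights plus one bias per output). The sketch below is a worked check under stated
assumptions: the helper names are hypothetical, the input is assumed to have a single grayscale
channel, and the Dense128 count is taken from the table as reported rather than recomputed.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of every filter plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


conv1 = conv2d_params(5, 5, 1, 32)    # assumes a 1-channel input image
conv2 = conv2d_params(5, 5, 32, 32)
conv3 = conv2d_params(3, 3, 32, 64)
conv4 = conv2d_params(3, 3, 64, 64)
dense128 = 204928                     # taken from Table 1 as reported
dense33 = dense_params(128, 33)

# Under the dense formula, the reported Dense128 count corresponds to
# this many flattened input values.
implied_inputs = dense128 // 128 - 1

total = conv1 + conv2 + conv3 + conv4 + dense128 + dense33
print(conv1, conv2, conv3, conv4, dense33, implied_inputs, total)
```

Pooling, flatten, and SoftMax layers have no trainable parameters, so they do not contribute to the
total.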


5. RESULTS AND DISCUSSION
5.1. Dataset preparation 

We have built a new Amazigh dataset, similar in format to the modified national institute of
standards and technology (MNIST) dataset, from the Amazigh handwritten character database (AMHCD).
Handwritten character images were converted to 28x28 images and transformed to CSV format using the
Python Image library. The purpose of this operation is to prepare our data for training and testing
with a CNN using the Keras library. The dataset split is described in Table 2.


Table 2. Dataset partition

Partition                Number of characters
Training images (75%)    19,968
Validation images (25%)  4,992
Total                    24,960


The first step is to load the converted dataset: training images, testing images, and labels.
The labels are the 33 characters, numbered from 0 to 32. Then we apply a grayscale normalization to
reduce the effect of illumination differences. The next step is to reshape each image into 3 dimensions
(height=28 px, width=28 px, channel=1). The data can then be visualized as shown in Figure 3.

Figure 3. Visualization of character features
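
The normalization and reshaping steps of the dataset preparation can be sketched in plain Python as
follows. This is an illustrative sketch only: the function names and the sample row are hypothetical,
and dividing by 255 assumes 8-bit grayscale pixel values, which the text does not state explicitly.

```python
def normalize(pixels, max_value=255.0):
    """Scale grayscale values to the range [0, 1] (assumes 8-bit pixels)."""
    return [p / max_value for p in pixels]


def reshape_image(flat, height=28, width=28):
    """Turn a flat list of pixels into a height x width x 1 nested list."""
    if len(flat) != height * width:
        raise ValueError("expected %d pixels, got %d" % (height * width, len(flat)))
    return [
        [[flat[r * width + c]] for c in range(width)]
        for r in range(height)
    ]


# Hypothetical CSV row of 784 pixel values (label column already removed):
# all black except the last pixel, which is white.
row = [0] * 783 + [255]
image = reshape_image(normalize(row))
print(len(image), len(image[0]), len(image[0][0]))
```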


5.2. Results and discussions 

The system was built as a Keras CNN model with TensorFlow as backend. We used
60 epochs to train the DCNN. The new approach achieves the best performance without any preprocessing
step (such as those used in [18]-[23]). The training process requires about 2,491 seconds to reach the
maximum accuracy of 99.37% at epoch 47, as shown in Figure 4. We can conclude that we have achieved a
very high classification accuracy and a very low loss rate. Nevertheless, some characters were
predicted incorrectly during the classification phase. This is due to the similarity between some
characters, such as (© and O ) and (K and K").

Figure 4. The accuracy and the loss of the proposed architecture (60 epochs)

We generated the confusion matrix shown in Figure 5 to summarize the performance of our
classification algorithm. The matrix summarizes the prediction results on our classification problem:
the numbers of correctly and incorrectly predicted characters are given as counts, broken down by
class. As the matrix shows, only 4 characters are misclassified, a small number compared to the
correctly predicted ones. The misclassifications are due to characters that were not well written
during the construction of the AMHCD database.

Figure 5. Confusion matrix of the proposed model
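
A confusion matrix of the kind shown in Figure 5 can be built by counting (true class, predicted
class) pairs. The sketch below uses a hypothetical toy example with 3 classes rather than the 33
Tifinagh classes, and the function names are illustrative only.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def count_misclassified(matrix):
    """Sum of all off-diagonal entries."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)


# Hypothetical labels for a 3-class toy problem.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
errors = count_misclassified(cm)
print(cm, errors)
```

Correct predictions accumulate on the diagonal, so the off-diagonal cells show directly which pairs of
characters are confused with each other.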


5.3. Optimization of the hyperparameter learning rate 

In our model we ran several tests with different learning rate values. The learning rate is one of
the most crucial hyperparameters when configuring a neural network. It controls how much the model
changes in response to the estimated error each time the model weights are updated. Choosing the
value of the learning rate is a difficult task. Table 3 provides the results of the proposed
architecture at different learning rates. The optimal learning rate is 0.009, which reaches a training
accuracy of 99.27% in the shortest training time (1,976 seconds, 38 epochs).
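
The effect of the learning rate on a weight update can be illustrated with plain gradient descent on a
toy function. This sketch is not the optimizer used to train our model: the function f(w) = w**2, the
starting point, and the three learning rate values are hypothetical and chosen only to show slow
convergence, fast convergence, and divergence.

```python
def gradient_descent(learning_rate, steps=50, start=5.0):
    """Minimize the toy function f(w) = w**2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        grad = 2.0 * w               # derivative of w**2
        w -= learning_rate * grad    # the learning rate scales each update
    return w


small = abs(gradient_descent(0.001))     # updates too small: barely moves
moderate = abs(gradient_descent(0.1))    # converges close to the minimum at 0
too_large = abs(gradient_descent(1.1))   # updates overshoot and diverge
print(small, moderate, too_large)
```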


Table 3. Test results for different values of the learning rate parameter

Learning rate  Accuracy  Epoch  Time (seconds)
0.001          99.22%    60     3,420
0.009          99.27%    38     1,976
0.0006         99.29%    50     2,600
0.0005         99.37%    47     2,491
0.0004         99.37%    57     3,363
0.0003         99.35%    58     3,190
0.0007         99.23%    48     2,832
0.0008         99.31%    44     2,376
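
The trade-off between accuracy and training time in Table 3 can be read off programmatically. The
sketch below copies the rows of Table 3 into a list and selects the fastest and the most accurate
configurations; the variable names and the tie-breaking rule (shorter time wins) are choices of this
sketch, not part of the reported experiments.

```python
# (learning rate, accuracy in %, epoch, time in seconds), copied from Table 3
results = [
    (0.001, 99.22, 60, 3420),
    (0.009, 99.27, 38, 1976),
    (0.0006, 99.29, 50, 2600),
    (0.0005, 99.37, 47, 2491),
    (0.0004, 99.37, 57, 3363),
    (0.0003, 99.35, 58, 3190),
    (0.0007, 99.23, 48, 2832),
    (0.0008, 99.31, 44, 2376),
]

# Configuration with the shortest training time.
fastest = min(results, key=lambda r: r[3])

# Configuration with the highest accuracy; ties are broken by shorter time.
most_accurate = max(results, key=lambda r: (r[1], -r[3]))

print(fastest, most_accurate)
```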


5.4. Comparison of the achieved results with other previous works

The proposed system shows good results compared with previous works. The best training
accuracy obtained with a good training time is 99.27%, as shown in Table 4. The use of the Keras
library and the optimization of the learning rate hyperparameter have led to an improved character
recognition system with good accuracy and a very good execution time.


Table 4. Comparison of the achieved results with other previous works

Previous work                             Images used from AMHCD  Training size  Test size  Accuracy
Geometrical methods [21]                  1,700                   1,000          700        92.30%
Baselines features [24]                   24,180                  21,762         2,418      94.96%
HMMs features [19]                        20,180                  16,120         8,060      97.89%
Fusion of classifiers: neural networks    165                     33             33         81.21%
  and support vector machine [20]
MLP and HMM [25]                          7,200                   1,800          5,400      92.33%
CNN DBN [13]                              24,180                  -              -          98.25%
CNN [11]                                  25,740                  20,592         5,148      99.10%
Proposed system                           25,740                  19,305         6,435      99.27%


6. CONCLUSION 

In this paper, we have built a new Tifinagh handwritten character recognition system based
on optimized deep convolutional neural networks. The system was trained to recognize the 33 characters
using the AMHCD dataset. Experimental tests were conducted with 33-class cross validation. The system
outperforms traditional approaches by addressing the issues of slowness and confusion between some
characters. The experimental results show that the CNN model is able to achieve a best training
accuracy of 99.27%.

Tifinagh character recognition still has challenging problems that need to be solved. For example,
the similarity between some characters of the Amazigh handwritten character database (AMHCD) leads
to wrong predictions, and the training time should be reduced. As future work, we plan to improve the
training time and solve confusion problems for composed characters by introducing improvements to
the AMHCD database and the CNN architecture.